The Truth About Effects, Capabilities and Effect Subset Constraints
An effect or capability is a set of (compile-time) region names. We use effect to refer to the region names that must be “live” for some expression to type-check and capability to refer to the region names that are “live” at some program point. A effect subset constraint indicates that all region names that appear in one effect qualifier also appear in another effect qualifier. Each program point has a set of “known effect subset relations”.
The intuition is that a program point’s capability and subset relations must imply that an expression’s effect describes live regions, else the expression does not type-check. As we’ll see, default effects for functions were carefully designed so that most Cyclone code runs no risk of such an “effect check” ever failing. But using existential types effectively requires a more complete understanding of the system, though perhaps not as complete as this section presents.
The form of effects or capabilities is described as follows:
- {} is the empty set. At most the heap region is accessed by an expression having this effect.
r is the set containing the indivisible effect
r. This effect variable can be isntantiated with a set consisting of one or more region names.- e1 + e2 is the set containing the effects e1 and e2. That is, we write + for set-union.
- regions(t), where t is a type is the set containing all of the
region names contained in t and regions(
a) for all type variables
a contained in t.
The description of regions(t) appears circular, but in fact if we gave
the definition for each form of types, it would not be. The point is
that regions(a) is an "atomic effect" in the sense that it stands for
a set of regions that cannot be further decomposed without knowing
a. The primary uses of regions(t) are descibed below.
The form of an effect subset constraint is e1 <= e2 where e1 and e2 are both effects.
We now describe the capability at each program point. On function entry, the capability is the function’s effect (typically the regions of the parameters and result, but fully described below). In a local block or a growable-region statement, the capability is the capability of the enclosing context plus the block/statement’s region name.
The known effect subset relation at a program point are only those that are mentioned in the type of the function within which the program point occurs.
We can now describe an expression’s effect: If it reads or writes to
memory described by a region name r, then the effect contains
{
r}. If it calls a function with effect e, then the effect conatins
e. Every function (type) has an effect, but programmers almost never
write down an explicit effect. To do so, one puts “; e” at the end of
the parameter list, wehre e is an effect. For example, we could write:
`a id(`a x; {}) { return x; }
because the function does not access any memory.
If a function takes parameters of types t1,…,tn and returns a value
of type t, the default effect is simply
regions(t1)+…+regions(tn)+regions(t). In English, the default
assumption is that a function may dereference any pointers it is
passed, so the corresponding regions must be live. In our example
above, the default effect would have been regions(a). If the caller
had instantiated
a with int*r, then with the default effect, the
type-checker would require
r to be live, but with the explicit effect
{} it would not. However, dangling pointers can be created only when
using existential types, so the difference is rarely noticed.
By default, a function (type) has no effect subset constraints. That is, the function assumes no relationship between all the effect variables that appear in its type. Adding explicit subset relationships enables more subtyping in the callee and more stringent requirements at the call site (namely that the relationship holds).
We can describe when a capability e and a set of effect subset
relations s imply an effect, although your intuition probably
suffices. A “normalized effect” is either {} or unions of “atomic
effects”, where an atomic effect is either {r} or regions(
a). For
any effect e1, we can easily compute an equivalent normalized effect
e2. Now, e and s imply e1 if they imply every {r} and regions(
a) in
e2. To imply {r} (or regions(
a)), we need {r} (or regions(
a)) to
be in (a normalized effect of) e) or for b to contain some r <=
r2
such that e and b imply `r2. Or something like that.
All of these complications are unnecessary except for existential types, to which we now return. Explicit bounds are usually necessary to make values of existential types usable. To see why, consider the example from the previous section, with the struct declaration changed to remove the explicit bound:
struct T {<`a>
`a f1;
int f(`a, int);
};
(So the declaration of t should just have type struct T.) Now the
function call f(arg,19) at the end of g will not type-check because
the (default) effect for f includes regions(b), which cannot be
established at the call site. But with the bound, we know that
regions(
b) <=`H, which is sufficient to prove the call won’t read
through any dangling pointers.